Results 1 - 20 of 28
1.
J Biomed Inform ; 135: 104235, 2022 11.
Article in English | MEDLINE | ID: mdl-36283581

ABSTRACT

OBJECTIVE: The free-text Condition data field in ClinicalTrials.gov is not amenable to computational processes for retrieving, aggregating, and visualizing clinical studies by condition category. This paper contributes a method for automated ontology-based categorization of clinical studies by their conditions. MATERIALS AND METHODS: Our method first maps text entries in ClinicalTrials.gov's Condition field to standard condition concepts in the OMOP Common Data Model, using SNOMED CT as a reference ontology and Usagi for concept normalization, followed by hierarchical traversal of the SNOMED ontology for concept expansion, ontology-driven condition categorization, and visualization. We compared the accuracy of this method to that of the MeSH-based method. RESULTS: We reviewed the 4,506 studies on Vivli.org categorized by our method. Condition terms of 4,501 (99.89%) studies were successfully mapped to SNOMED CT concepts, and with a minimum concept mapping score threshold, 4,428 (98.27%) studies were categorized into 31 predefined categories. When validated against manual categorization results on a random sample of 300 studies, our method achieved an estimated categorization accuracy of 95.7%, while the MeSH-based method had an accuracy of 85.0%. CONCLUSION: We showed that categorizing clinical studies by their Condition terms with reference to SNOMED CT achieved better accuracy and coverage than using MeSH terms. The proposed ontology-driven condition categorization produces accurate clinical study categorizations that enable clinical researchers to aggregate evidence from large numbers of clinical studies.
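
Editor's note: to illustrate the hierarchical-traversal step described above, here is a minimal Python sketch under hypothetical data. The toy is-a hierarchy and category table are invented for illustration; the paper's actual pipeline relies on Usagi and the OMOP/SNOMED vocabulary tables, not on this code.

    # Minimal sketch of ontology-driven condition categorization (hypothetical data).
    from collections import deque

    # Toy is-a hierarchy: SNOMED concept -> set of parent concepts.
    IS_A = {
        "Type 2 diabetes mellitus": {"Diabetes mellitus"},
        "Diabetes mellitus": {"Disorder of endocrine system"},
        "Disorder of endocrine system": {"Disease"},
    }

    # Predefined category roots (standing in for the 31 condition categories).
    CATEGORY_ROOTS = {"Disorder of endocrine system": "Endocrine/metabolic"}

    def categorize(concept: str):
        """Walk the is-a hierarchy upward until a category root is reached."""
        queue, seen = deque([concept]), {concept}
        while queue:
            current = queue.popleft()
            if current in CATEGORY_ROOTS:
                return CATEGORY_ROOTS[current]
            for parent in IS_A.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return None  # unmapped / uncategorized

    print(categorize("Type 2 diabetes mellitus"))  # -> "Endocrine/metabolic"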


Subjects
Medical Subject Headings, Systematized Nomenclature of Medicine, Data Visualization
2.
JMIR Mhealth Uhealth ; 10(2): e31048, 2022 02 10.
Article in English | MEDLINE | ID: mdl-35142627

ABSTRACT

Person-generated data (PGD) are a valuable source of information on a person's health state in daily life and in between clinic visits. To fully extract value from PGD, health care organizations must be able to smoothly integrate data from PGD devices into routine clinical workflows. Ideally, to enhance efficiency and flexibility, such integrations should follow reusable processes that can easily be replicated for multiple devices and data types. Instead, current PGD integrations tend to be one-off efforts entailing high costs to build and maintain custom connections with each device and its proprietary data format. This viewpoint paper formulates the integration of PGD into clinical systems and workflows as a PGD integration pipeline and reviews the functional components of such a pipeline. A PGD integration pipeline includes PGD acquisition, aggregation, and consumption. Acquisition is the person-facing component that includes both technical (eg, sensors, smartphone apps) and policy components (eg, informed consent). Aggregation pools, standardizes, and structures data into formats that can be used in health care settings, such as within electronic health record-based workflows. PGD consumption is wide-ranging, occurring through different solutions in different care settings (inpatient, outpatient, consumer health) and for different types of users (clinicians, patients). The adoption of data and metadata standards, such as those from IEEE and Open mHealth, would facilitate aggregation and enable broader consumption. We illustrate the benefits of a standards-based integration pipeline with the use case of home blood pressure monitoring. A standards-based PGD integration pipeline can flexibly streamline the clinical use of PGD while accommodating the complexity, scale, and rapid evolution of today's health care systems.
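
Editor's note: a minimal sketch of the "aggregation" step for the home blood pressure use case, normalizing a device-specific reading into an Open mHealth-style structure. The field names below are simplified stand-ins, not the exact Open mHealth schema.

    # Illustrative normalization of a vendor-specific blood pressure reading
    # into an Open mHealth-style structure (field names are simplified).
    def normalize_bp(vendor_reading: dict) -> dict:
        return {
            "body": {
                "systolic_blood_pressure": {"value": vendor_reading["sys"], "unit": "mmHg"},
                "diastolic_blood_pressure": {"value": vendor_reading["dia"], "unit": "mmHg"},
                "effective_time_frame": {"date_time": vendor_reading["timestamp"]},
            }
        }

    reading = {"sys": 128, "dia": 82, "timestamp": "2022-01-15T08:30:00Z"}
    print(normalize_bp(reading))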


Subjects
Mobile Applications, Telemedicine, Delivery of Health Care, Electronic Health Records, Humans, Reference Standards
3.
J Med Internet Res ; 23(11): e34493, 2021 11 09.
Article in English | MEDLINE | ID: mdl-34751656

ABSTRACT

Data integration, the processes by which data are aggregated, combined, and made available for use, has been key to the development and growth of many technological solutions. In health care, we are experiencing a revolution in the use of sensors to collect data on patient behaviors and experiences. Yet the potential of these data to transform health outcomes is being held back. Deficits in standards, lexicons, data rights, permissioning, and security have been well documented; less well documented is the cultural adoption of sensor data integration as a priority for large-scale deployment and impact on patients' lives. The use and reuse of trustworthy data to make better and faster decisions across drug development and care delivery will require an understanding of all stakeholder needs and best practices to ensure these needs are met. The Digital Medicine Society is launching a new multistakeholder Sensor Data Integration Tour of Duty to address these challenges and more, providing a clear direction on how sensor data can fulfill their potential to enhance patients' lives.


Subjects
Data Collection, Delivery of Health Care, Humans, Technology
4.
J Med Internet Res ; 23(9): e29875, 2021 09 15.
Article in English | MEDLINE | ID: mdl-34524089

ABSTRACT

BACKGROUND: Digital clinical measures collected via various digital sensing technologies such as smartphones, smartwatches, wearables, ingestibles, and implantables are increasingly used by individuals and clinicians to capture health outcomes or behavioral and physiological characteristics of individuals. Although academia is taking an active role in evaluating digital sensing products, academic contributions to advancing the safe, effective, ethical, and equitable use of digital clinical measures are poorly characterized. OBJECTIVE: We performed a systematic review to characterize the nature of academic research on digital clinical measures and to compare and contrast the types of sensors used and the sources of funding support for specific subareas of this research. METHODS: We conducted a PubMed search using a range of search terms to retrieve peer-reviewed articles reporting US-led academic research on digital clinical measures between January 2019 and February 2021. We screened each publication against specific inclusion and exclusion criteria. We then identified and categorized research studies based on the types of academic research, sensors used, and funding sources. Finally, we compared and contrasted the funding support for these specific subareas of research and sensor types. RESULTS: The search retrieved 4240 articles of interest. Following the screening, 295 articles remained for data extraction and categorization. The top five research subareas included operations research (research analysis; n=225, 76%), analytical validation (n=173, 59%), usability and utility (data visualization; n=123, 42%), verification (n=93, 32%), and clinical validation (n=83, 28%). The three most underrepresented areas of research into digital clinical measures were ethics (n=0, 0%), security (n=1, 0.5%), and data rights and governance (n=1, 0.5%). Movement and activity trackers were the most commonly studied sensor type, and physiological (mechanical) sensors were the least frequently studied. We found that government agencies are providing the most funding for research on digital clinical measures (n=192, 65%), followed by independent foundations (n=109, 37%) and industries (n=56, 19%), with the remaining 12% (n=36) of these studies completely unfunded. CONCLUSIONS: Specific subareas of academic research related to digital clinical measures are not keeping pace with the rapid expansion and adoption of digital sensing products. An integrated and coordinated effort is required across academia, academic partners, and academic funders to establish the field of digital clinical measures as an evidence-based field worthy of our trust.


Subjects
Delivery of Health Care, Smartphone, Humans
5.
Sci Data ; 7(1): 281, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32855408

ABSTRACT

We present Chia, a novel, large annotated corpus of patient eligibility criteria extracted from 1,000 interventional, Phase IV clinical trials registered in ClinicalTrials.gov. This dataset includes 12,409 annotated eligibility criteria, represented by 41,487 distinct entities of 15 entity types and 25,017 relationships of 12 relationship types. Each criterion is represented as a directed acyclic graph, which can be easily transformed into Boolean logic to form a database query. Chia can serve as a shared benchmark to develop and test future machine learning, rule-based, or hybrid methods for information extraction from free-text clinical trial eligibility criteria.
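
Editor's note: a small sketch of the idea that a criterion graph can be flattened into Boolean logic for querying. The node structure below is hypothetical and far simpler than Chia's actual entity and relationship types.

    # Flatten a (toy) criterion DAG into a Boolean expression string.
    def to_boolean(node: dict) -> str:
        if "entity" in node:                     # leaf: a single annotated entity
            return node["entity"]
        op = node["op"].upper()                  # "AND" / "OR" over child subgraphs
        return "(" + f" {op} ".join(to_boolean(c) for c in node["children"]) + ")"

    criterion = {
        "op": "and",
        "children": [
            {"entity": "age >= 18"},
            {"op": "or", "children": [{"entity": "type 2 diabetes"},
                                      {"entity": "prediabetes"}]},
        ],
    }
    print(to_boolean(criterion))  # (age >= 18 AND (type 2 diabetes OR prediabetes))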


Subjects
Clinical Trials, Phase IV as Topic, Humans
6.
J Clin Epidemiol ; 115: 77-89, 2019 11.
Article in English | MEDLINE | ID: mdl-31302205

ABSTRACT

OBJECTIVES: The Data Abstraction Assistant (DAA) is a software application that links items abstracted into a systematic review data collection form to their locations in the study report. We conducted a randomized cross-over trial that compared DAA-facilitated single data abstraction plus verification ("DAA verification"), single data abstraction plus verification ("regular verification"), and independent dual data abstraction plus adjudication ("independent abstraction"). STUDY DESIGN AND SETTING: This study is an online randomized cross-over trial with 26 pairs of data abstractors. Each pair abstracted data from six articles, two per approach. Outcomes were the proportion of errors and the time taken. RESULTS: The overall proportion of errors was 17% for DAA verification, 16% for regular verification, and 15% for independent abstraction. DAA verification was associated with higher odds of errors when compared with regular verification (adjusted odds ratio [OR] = 1.08; 95% confidence interval [CI]: 0.99-1.17) or independent abstraction (adjusted OR = 1.12; 95% CI: 1.03-1.22). For each article, DAA verification took 20 minutes (95% CI: 1-40) longer than regular verification but 46 minutes (95% CI: 26-66) less than independent abstraction. CONCLUSION: Independent abstraction may be necessary only for complex data items. DAA provides an audit trail that is crucial for reproducible research.


Subjects
Abstracting and Indexing/methods, Systematic Reviews as Topic, Cross-Over Studies, Data Collection, Humans, Odds Ratio, Random Allocation, Software, Young Adult
7.
Syst Rev ; 5(1): 196, 2016 11 22.
Article in English | MEDLINE | ID: mdl-27876082

ABSTRACT

BACKGROUND: Data abstraction, a critical systematic review step, is time-consuming and prone to errors. Current standards for approaches to data abstraction rest on a weak evidence base. We developed the Data Abstraction Assistant (DAA), a novel software application designed to facilitate the abstraction process by allowing users to (1) view study article PDFs juxtaposed to electronic data abstraction forms linked to a data abstraction system, (2) highlight (or "pin") the location of the text in the PDF, and (3) copy relevant text from the PDF into the form. We describe the design of a randomized controlled trial (RCT) that compares the relative effectiveness of (A) DAA-facilitated single abstraction plus verification by a second person, (B) traditional (non-DAA-facilitated) single abstraction plus verification by a second person, and (C) traditional independent dual abstraction plus adjudication to ascertain the accuracy and efficiency of abstraction. METHODS: This is an online, randomized, three-arm, crossover trial. We will enroll 24 pairs of abstractors (i.e., sample size is 48 participants), each pair comprising one less and one more experienced abstractor. Pairs will be randomized to abstract data from six articles, two under each of the three approaches. Abstractors will complete pre-tested data abstraction forms using the Systematic Review Data Repository (SRDR), an online data abstraction system. The primary outcomes are (1) proportion of data items abstracted that constitute an error (compared with an answer key) and (2) total time taken to complete abstraction (by two abstractors in the pair, including verification and/or adjudication). DISCUSSION: The DAA trial uses a practical design to test a novel software application as a tool to help improve the accuracy and efficiency of the data abstraction process during systematic reviews. Findings from the DAA trial will provide much-needed evidence to strengthen current recommendations for data abstraction approaches. TRIAL REGISTRATION: The trial is registered at National Information Center on Health Services Research and Health Care Technology (NICHSR) under Registration # HSRP20152269: https://wwwcf.nlm.nih.gov/hsr_project/view_hsrproj_record.cfm?NLMUNIQUE_ID=20152269&SEARCH_FOR=Tianjing%20Li . All items from the World Health Organization Trial Registration Data Set are covered at various locations in this protocol. Protocol version and date: This is version 2.0 of the protocol, dated September 6, 2016. As needed, we will communicate any protocol amendments to the Institutional Review Boards (IRBs) of Johns Hopkins Bloomberg School of Public Health (JHBSPH) and Brown University. We also will make appropriate as-needed modifications to the NICHSR website in a timely fashion.


Subjects
Abstracting and Indexing, Software, Systematic Reviews as Topic, Evidence-Based Medicine/methods, Humans
8.
J Biomed Inform ; 60: 66-76, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26820188

ABSTRACT

OBJECTIVE: To develop a multivariate method for quantifying the population representativeness of related clinical studies and a computational method for identifying and characterizing underrepresented subgroups in clinical studies. METHODS: We extended a published metric named the Generalizability Index for Study Traits (GIST) to include multiple study traits for quantifying the population representativeness of a set of related studies, assuming independence and equal importance of all study traits. On this basis, we compared the effectiveness of GIST and multivariate GIST (mGIST) qualitatively. We further developed an algorithm called "Multivariate Underrepresented Subgroup Identification" (MAGIC) for constructing optimal combinations of distinct value intervals of multiple traits to define underrepresented subgroups in a set of related studies. Using Type 2 diabetes mellitus (T2DM) as an example, we identified and extracted frequently used quantitative eligibility criteria variables from a set of clinical studies. We profiled the T2DM target population using National Health and Nutrition Examination Survey (NHANES) data. RESULTS: According to the mGIST scores for four example variables, i.e., age, HbA1c, BMI, and gender, the included observational T2DM studies had better population representativeness than the interventional T2DM studies. Among the interventional T2DM studies, Phase I trials had better population representativeness than Phase III trials. People at least 65 years old with HbA1c values between 5.7% and 7.2% were particularly underrepresented in the included T2DM trials. These results confirmed well-known knowledge and demonstrated the effectiveness of our methods in population representativeness assessment. CONCLUSIONS: mGIST is effective at quantifying the population representativeness of related clinical studies using multiple numeric study traits. MAGIC identifies underrepresented subgroups in clinical studies. Both data-driven methods can be used to improve the transparency of design bias in participant selection at the research community level.
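
Editor's note: the following is a simplified illustration of the multi-trait idea, not the published GIST/mGIST formulas. Under the assumption of equal trait weights, it scores, for each trait, the fraction of target-population records whose value falls inside at least one study's eligible range, and averages across traits; all data shown are hypothetical.

    # Simplified, hypothetical illustration of multi-trait representativeness scoring.
    def trait_coverage(population_values, study_ranges):
        def eligible(v):
            return any(lo <= v <= hi for lo, hi in study_ranges)
        return sum(eligible(v) for v in population_values) / len(population_values)

    def simple_mgist(population, study_ranges_by_trait):
        scores = [trait_coverage([rec[t] for rec in population], ranges)
                  for t, ranges in study_ranges_by_trait.items()]
        return sum(scores) / len(scores)     # equal importance across traits

    population = [{"age": 70, "hba1c": 6.1}, {"age": 52, "hba1c": 8.4}]
    studies = {"age": [(18, 64), (40, 75)], "hba1c": [(7.0, 10.0)]}
    print(simple_mgist(population, studies))  # 0.75 for this toy example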


Subjects
Algorithms, Biomedical Research/standards, Demography/methods, Selection Bias, Clinical Trials as Topic, Databases, Factual, Diabetes Mellitus, Type 2, Humans, Medical Informatics Computing, Multivariate Analysis, Nutrition Surveys, Observational Studies as Topic, Patient Selection
9.
J Biomed Inform ; 54: 241-55, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25615940

ABSTRACT

OBJECTIVE: To develop a method for profiling the collective populations targeted for recruitment by multiple clinical studies addressing the same medical condition, one eligibility feature at a time. METHODS: Using the previously published COMPACT database as the backend, we designed a scalable method for visual aggregate analysis of clinical trial eligibility features. This method consists of four modules: eligibility feature frequency analysis, query building, distribution analysis, and visualization. The method is capable of analyzing (1) frequently used qualitative and quantitative features for recruiting subjects for a selected medical condition, (2) the distribution of study enrollment over consecutive value points or value intervals of each quantitative feature, and (3) the distribution of studies over the boundary values, permissible value ranges, and value range widths of each feature. All analysis results were visualized using the Google Charts API. Five recruited potential users assessed the usefulness of this method for identifying common patterns in any selected eligibility feature for clinical trial participant selection. RESULTS: We implemented this method as a Web-based analytical system called VITTA (Visual Analysis Tool of Clinical Study Target Populations). We illustrated the functionality of VITTA using two sample queries involving the quantitative features BMI and HbA1c for the conditions "hypertension" and "Type 2 diabetes", respectively. The recruited potential users rated the user-perceived usefulness of VITTA with an average score of 86.4/100. CONCLUSIONS: We contributed a novel aggregate analysis method that enables the interrogation of common patterns in quantitative eligibility criteria and the collective target populations of multiple related clinical studies. A larger-scale study is warranted to formally assess the usefulness of VITTA among clinical investigators and sponsors in various therapeutic areas.
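
Editor's note: a minimal sketch of the distribution-analysis idea described above, namely how much study enrollment permits each value point of a quantitative feature such as BMI. The study ranges and enrollment figures below are hypothetical.

    # For each value point, sum enrollment across studies whose permissible
    # range covers that value (hypothetical data).
    def enrollment_by_value(studies, value_points):
        # studies: list of (low_bound, high_bound, enrollment)
        return {v: sum(n for lo, hi, n in studies if lo <= v <= hi)
                for v in value_points}

    bmi_studies = [(18.5, 30.0, 200), (25.0, 40.0, 150), (27.0, 35.0, 80)]
    print(enrollment_by_value(bmi_studies, [20, 26, 32, 38]))
    # -> {20: 200, 26: 350, 32: 230, 38: 150}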


Subjects
Biomedical Research/methods, Clinical Trials as Topic/methods, Data Mining/methods, Internet, Patient Selection, Databases, Factual, Female, Humans, Male, Models, Theoretical
10.
J Biomed Inform ; 52: 78-91, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24239612

ABSTRACT

To date, the scientific process for generating, interpreting, and applying knowledge has received less informatics attention than operational processes for conducting clinical studies. The activities of these scientific processes - the science of clinical research - are centered on the study protocol, which is the abstract representation of the scientific design of a clinical study. The Ontology of Clinical Research (OCRe) is an OWL 2 model of the entities and relationships of study design protocols for the purpose of computationally supporting the design and analysis of human studies. OCRe's modeling is independent of any specific study design or clinical domain. It includes a study design typology and a specialized module called ERGO Annotation for capturing the meaning of eligibility criteria. In this paper, we describe the key informatics use cases of each phase of a study's scientific lifecycle, present OCRe and the principles behind its modeling, and describe applications of OCRe and associated technologies to a range of clinical research use cases. OCRe captures the central semantics that underlies the scientific processes of clinical research and can serve as an informatics foundation for supporting the entire range of knowledge activities that constitute the science of clinical research.


Subjects
Biological Ontologies, Biomedical Research, Medical Informatics, Computational Biology, Evidence-Based Medicine, Humans, Models, Theoretical
11.
AMIA Annu Symp Proc ; 2014: 1777-86, 2014.
Article in English | MEDLINE | ID: mdl-25954450

ABSTRACT

ClinicalTrials.gov presents great opportunities for analyzing commonalities in clinical trial target populations to facilitate knowledge reuse when designing eligibility criteria of future trials or to reveal potential systematic biases in selecting population subgroups for clinical research. Towards this goal, this paper presents a novel data resource for enabling such analyses. Our method includes two parts: (1) parsing and indexing eligibility criteria text; and (2) mining common eligibility features and attributes of common numeric features (e.g., A1c). We designed and built a database called "Commonalities in Target Populations of Clinical Trials" (COMPACT), which stores structured eligibility criteria and trial metadata in a readily computable format. We illustrate its use in an example analytic module called CONECT using COMPACT as the backend. Type 2 diabetes is used as an example to analyze commonalities in the target populations of 4,493 clinical trials on this disease.
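
Editor's note: an illustrative sketch of the "parsing and indexing" step, extracting a numeric eligibility feature (name, comparison operator, value or range) from criteria text with a regular expression. The pattern is deliberately simple and only a stand-in for the paper's actual parsing method.

    # Toy extraction of numeric eligibility features from criteria text.
    import re

    PATTERN = re.compile(
        r"(?P<feature>HbA1c|BMI|age)\s*(?P<op><=|>=|<|>|between)\s*"
        r"(?P<v1>\d+(?:\.\d+)?)(?:\s*(?:and|-)\s*(?P<v2>\d+(?:\.\d+)?))?",
        re.IGNORECASE,
    )

    def parse_numeric_criteria(text: str):
        return [m.groupdict() for m in PATTERN.finditer(text)]

    print(parse_numeric_criteria("HbA1c between 7.0 and 10.5; BMI <= 40"))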


Subjects
Clinical Trials as Topic, Databases, Factual, Eligibility Determination/classification, Humans, Patient Selection, Registries
12.
Contemp Clin Trials ; 34(2): 348-55, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23380028

ABSTRACT

OBJECTIVE: We examine the extent to which ClinicalTrials.gov is meeting its goal of providing oversight and transparency of clinical trials with human subjects. METHODS: We analyzed the ClinicalTrials.gov database contents as of June 2011, comparing interventions, medical conditions, and trial characteristics by sponsor type. We also conducted a detailed analysis of incomplete data. RESULTS: Among trials with only government sponsorship (N=9252), 36% were observational and 64% interventional; in contrast, almost all (90%) industry-only sponsored trials were interventional. Industry-only sponsored interventional trials (N=30,036) were most likely to report a drug intervention (81%), followed by biologics (9%) and devices (8%). Government-only interventional trials (N=5886) were significantly more likely to test behavioral interventions (28%) and procedures (13%) than industry-only trials (p<0.001). The medical conditions most frequently studied in industry-only trials were cancer (19%), cardiovascular conditions (12%), and endocrine/metabolic disorders (11%). Government-only funded trials were more likely to study mental health (19% vs. 7% for industry, p<0.001) and viral infections, including HIV (15% vs. 7% for industry, p<0.001). Government-funded studies were also significantly more likely to be missing data about study design and intervention arms in the registry. For all studies, we report ambiguous and contradictory data entries. CONCLUSIONS: Industry-sponsored studies differ systematically from government-sponsored studies in study type, choice of interventions, conditions studied, and completeness of submitted information. Imprecise study design information, incomplete coding of conditions, out-of-date or unspecified enrollment numbers, and other missing data continue to hinder robust analyses of trials registered in ClinicalTrials.gov.


Subjects
Clinical Trials as Topic/statistics & numerical data, Databases, Factual/statistics & numerical data, Financing, Organized/statistics & numerical data, Registries, Clinical Trials as Topic/economics, Drug Industry/economics, Drug Industry/statistics & numerical data, Financing, Government/statistics & numerical data, Health Care Sector/economics, Health Care Sector/statistics & numerical data, Humans, Research Design/statistics & numerical data
13.
Article in English | MEDLINE | ID: mdl-22779055

ABSTRACT

Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models have been developed, but the need for extracting temporal expressions from eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation of temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME using an additional random sample of 20 eligibility criteria with temporal expressions that had no overlap with the training data, yielding 92.7% (76/82) inter-coder agreement on sentence chunking and 72% (72/100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.
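
Editor's note: a hypothetical sketch of what a frame-based annotation of a temporal criterion might look like as a data structure. The slot names are illustrative only and are not EliXR-TIME's actual classes.

    # Hypothetical frame for a temporal eligibility criterion.
    from dataclasses import dataclass

    @dataclass
    class TemporalFrame:
        event: str            # clinical event the constraint applies to
        relation: str         # e.g., "within", "before", "after"
        duration_value: float
        duration_unit: str    # e.g., "month", "week"
        anchor: str           # temporal anchor, e.g., "screening visit"

    # "Myocardial infarction within 6 months prior to screening"
    frame = TemporalFrame(event="myocardial infarction", relation="within",
                          duration_value=6, duration_unit="month",
                          anchor="screening visit")
    print(frame)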

14.
AMIA Annu Symp Proc ; 2012: 681-9, 2012.
Article in English | MEDLINE | ID: mdl-23304341

ABSTRACT

An abstraction network is an auxiliary network of nodes and links that provides a compact, high-level view of an ontology. Such a view lends support to ontology orientation, comprehension, and quality-assurance efforts. A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe). OCRe was selected as a representative of ontologies implemented using the Web Ontology Language (OWL) based on shared domains. The derivation of the partial-area taxonomy for the Entity hierarchy of OCRe is described. Utilizing the visualization of the content and structure of the hierarchy provided by the taxonomy, the Entity hierarchy is audited, and several errors and inconsistencies in OCRe's modeling of its domain are exposed. After appropriate corrections are made to OCRe, a new partial-area taxonomy is derived. The generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed.
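
Editor's note: a simplified sketch of the area-grouping idea behind a partial-area taxonomy: classes that exhibit the same set of relationship types fall into the same area. The class and property names are hypothetical, and the full derivation method also identifies partial-area roots within each area.

    # Group (toy) ontology classes into areas by their sets of relationship types.
    from collections import defaultdict

    class_properties = {
        "Study": {"hasProtocol", "hasOutcome"},
        "Interventional study": {"hasProtocol", "hasOutcome", "hasIntervention"},
        "Observational study": {"hasProtocol", "hasOutcome"},
    }

    areas = defaultdict(list)
    for cls, props in class_properties.items():
        areas[frozenset(props)].append(cls)

    for props, classes in areas.items():
        print(sorted(props), "->", classes)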


Subjects
Biomedical Research/classification, Vocabulary, Controlled, Medical Informatics
15.
AMIA Annu Symp Proc ; 2012: 856-65, 2012.
Article in English | MEDLINE | ID: mdl-23304360

ABSTRACT

Human studies are one of the most valuable sources of knowledge in biomedical research, but data about their design and results are currently widely dispersed in siloed systems. Federation of these data is needed to facilitate large-scale data analysis to realize the goals of evidence-based medicine. The Human Studies Database project has developed an informatics infrastructure for federated query of human studies databases, using a generalizable approach to ontology-based data access. Our approach has three main components. First, the Ontology of Clinical Research (OCRe) provides the reference semantics. Second, a data model, automatically derived from OCRe into XSD, maintains semantic synchrony of the underlying representations while facilitating data acquisition using common XML technologies. Finally, the Query Integrator issues queries distributed over the data, OCRe, and other ontologies such as SNOMED in BioPortal. We report on a demonstration of this infrastructure on data acquired from institutional systems and from ClinicalTrials.gov.


Subjects
Clinical Trials as Topic, Database Management Systems, Databases, Factual, Human Experimentation, Humans, Programming Languages, Vocabulary, Controlled
16.
J Biomed Inform ; 44(2): 239-50, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20851207

ABSTRACT

Formalizing eligibility criteria in a computer-interpretable language would facilitate eligibility determination for study subjects and the identification of studies on similar patient populations. Because such formalization is extremely labor-intensive, we transform the problem from one of fully capturing the semantics of criteria directly in a formal expression language to one of annotating free-text criteria in a format called ERGO annotation. The annotation can be done manually, or it can be partially automated using natural-language processing techniques. We evaluated our approach in three ways. First, we assessed the extent to which ERGO annotations capture the semantics of 1,000 eligibility criteria randomly drawn from ClinicalTrials.gov. Second, we demonstrated the practicality of the annotation process in a feasibility study. Finally, we demonstrated the computability of ERGO annotation by using it to (1) structure a library of eligibility criteria, (2) search for studies enrolling specified study populations, and (3) screen patients for potential eligibility for a study. We therefore demonstrate a new and practical method for incrementally capturing the semantics of free-text eligibility criteria into computable form.


Subjects
Eligibility Determination/methods, Semantics, Clinical Trials as Topic, Computational Biology, Databases, Factual, Information Storage and Retrieval/methods, Vocabulary, Controlled
17.
BMC Med Inform Decis Mak ; 10: 56, 2010 Sep 28.
Article in English | MEDLINE | ID: mdl-20920176

ABSTRACT

BACKGROUND: Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs). METHODS: ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study. RESULTS: We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers. CONCLUSIONS: Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols).
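
Editor's note: a schematic sketch of the two-stage idea described above, ranking sentences by how likely they are to describe a trial characteristic and then applying a simple rule to the top candidates. The keyword scorer below is only a stand-in for ExaCT's statistical text classifier, and the extraction rule is illustrative.

    # Two-stage sketch: (1) rank sentences, (2) extract a value with a simple rule.
    import re

    def score_sentence(sentence: str, keywords) -> int:
        return sum(kw in sentence.lower() for kw in keywords)

    def extract_sample_size(article_sentences):
        ranked = sorted(article_sentences,
                        key=lambda s: score_sentence(
                            s, ("enrolled", "randomized", "participants")),
                        reverse=True)
        for sentence in ranked[:5]:              # top-5 candidates
            match = re.search(r"\b(\d{2,6})\b\s+(?:participants|patients|subjects)",
                              sentence)
            if match:
                return int(match.group(1))
        return None

    sentences = ["The study was conducted in three centres.",
                 "A total of 412 participants were enrolled and randomized."]
    print(extract_sample_size(sentences))  # 412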


Subjects
Information Storage and Retrieval/methods, Periodicals as Topic, Randomized Controlled Trials as Topic, Humans, Information Storage and Retrieval/standards, Reproducibility of Results
18.
Summit Transl Bioinform ; 2010: 46-50, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-21347148

ABSTRACT

Formal, computer-interpretable representations of eligibility criteria would allow computers to better support key clinical research and care use cases such as eligibility determination. To inform the development of such formal representations for eligibility criteria, we conducted this study to characterize and quantify the complexity present in 1000 eligibility criteria randomly selected from studies in ClinicalTrials.gov. We classified the criteria by their complexity, semantic patterns, clinical content, and data sources. Our analyses revealed significant semantic and clinical content variability. We found that 93% of criteria were comprehensible, with 85% of these criteria having significant semantic complexity, including 40% relying on temporal data. We also identified several domains of clinical content. Using the findings of the study as requirements for computer-interpretable representations of eligibility, we discuss the challenges for creating such representations for use in clinical research and practice.

19.
Summit Transl Bioinform ; 2010: 51-5, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-21347149

ABSTRACT

Human studies, encompassing interventional and observational studies, are the most important source of evidence for advancing our understanding of health, disease, and treatment options. To promote discovery, the design and results of these studies should be made machine-readable for large-scale data mining, synthesis, and re-analysis. The Human Studies Database Project aims to define and implement an informatics infrastructure for institutions to share the design of their human studies. We have developed the Ontology of Clinical Research (OCRe) to model study features such as design type, interventions, and outcomes to support scientific query and analysis. We are using OCRe as the reference semantics for federated data sharing of human studies over caGrid, and are piloting this implementation with several Clinical and Translational Science Award (CTSA) institutions.

20.
Summit Transl Bioinform ; 2010: 66-70, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-21347152

ABSTRACT

An integrated data repository (IDR) containing aggregations of clinical, biomedical, economic, administrative, and public health data is a key component of an overall translational research infrastructure. But most available data repositories are designed using standard data warehouse architecture that employs arbitrary data encoding standards, making queries across disparate repositories difficult. In response to these shortcomings we have designed a Health Ontology Mapper (HOM) that translates terminologies into formal data encoding standards without altering the underlying source data. We believe the HOM system promotes inter-institutional data sharing and research collaboration, and will ultimately lower the barrier to developing and using an IDR.
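
Editor's note: a simplified sketch of the terminology-mapping idea, applying a translation table from local codes to a standard terminology at query time while leaving the source records untouched. The mapping table, site names, and codes are illustrative only, not HOM's actual implementation.

    # Translate local codes to a standard terminology at query time,
    # preserving the underlying source records (hypothetical mappings).
    LOCAL_TO_STANDARD = {
        ("site_a", "DM2"): "SNOMED:44054006",     # illustrative diabetes mapping
        ("site_b", "250.00"): "SNOMED:44054006",
    }

    def standardized_view(records):
        for rec in records:
            std = LOCAL_TO_STANDARD.get((rec["site"], rec["local_code"]))
            yield {**rec, "standard_code": std}   # source fields are preserved

    rows = [{"site": "site_a", "local_code": "DM2", "patient": "p1"},
            {"site": "site_b", "local_code": "250.00", "patient": "p2"}]
    print(list(standardized_view(rows)))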
